KUPS: constructing datasets of interacting and non-interacting protein pairs with associated attributions
Abstract
KUPS (The University of Kansas Proteomics Service) provides high-quality protein-protein interaction (PPI) data for researchers developing and evaluating computational models that predict PPIs. It lets users construct ready-to-use data sets of interacting protein pairs (IPPs), non-interacting protein pairs (NIPs) and associated features. Multiple filters and options allow the user to control the make-up of the IPPs and NIPs as well as the quality of the resultant data sets. Each data set is built from the overall database, which includes 185 446 IPPs and ∼1.5 billion NIPs from five primary databases: IntAct, HPRD, MINT, UniProt and the Gene Ontology. The IPP set can be restricted to specific model organisms, interaction types and experimental evidence. The NIP set can be generated using four different strategies, which can help alleviate biased-estimation problems. Multiple features can also be provided for all of the IPPs and NIPs. Additionally, KUPS provides two benchmark data sets to help researchers compare their algorithms to existing approaches. KUPS is freely available at http://www.ittc.ku.edu/chenlab.
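The abstract does not say what the four NIP-generation strategies are. As a minimal sketch, one commonly used approach to building a negative set is to sample random protein pairs that do not appear among the known IPPs. The function name `sample_non_interacting_pairs`, its parameters and the toy protein identifiers below are hypothetical and are not part of KUPS.

```python
import itertools
import random


def sample_non_interacting_pairs(proteins, interacting_pairs, n, seed=0):
    """Sample n unordered protein pairs that are not in the known IPP set.

    Illustrative random-pairing sketch only; it is an assumption, not
    necessarily one of the four strategies offered by KUPS.
    """
    # Store known interactions as unordered pairs so (A, B) matches (B, A).
    known = {frozenset(pair) for pair in interacting_pairs}

    # Enumerate every unordered pair of distinct proteins (no self-pairs)
    # and keep only those with no recorded interaction.
    candidates = [
        pair
        for pair in itertools.combinations(sorted(set(proteins)), 2)
        if frozenset(pair) not in known
    ]
    if n > len(candidates):
        raise ValueError("requested more NIPs than candidate pairs available")

    # A seeded generator keeps the sampled negative set reproducible.
    rng = random.Random(seed)
    return rng.sample(candidates, n)


# Toy data with hypothetical identifiers.
proteins = ["P1", "P2", "P3", "P4"]
ipps = [("P1", "P2"), ("P3", "P2")]
nips = sample_non_interacting_pairs(proteins, ipps, n=3)
```

Enumerating all candidate pairs is only practical for small protein sets. At the scale quoted in the abstract (about 1.5 billion NIPs), rejection sampling of random pairs would be the more realistic design. A pair with no recorded interaction may also still interact in reality, which is one source of the estimation bias the abstract mentions.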
Similar resources
Discovering Domains Mediating Protein Interactions
Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi domain proteins and the interaction between them is often defined by the pairs of their domains. Most of the former studies focus only on interacting domain pairs. However they do not consider the in...
Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein–protein interaction dataset
High-throughput studies of protein interactions may have produced, experimentally and computationally, the most comprehensive protein-protein interaction datasets in the completely sequenced genomes. It provides us an opportunity on a proteome scale, to discover the underlying protein interaction patterns. Here, we propose an approach to discovering motif pairs at interaction sites (often 3-8 r...
The Negatome database: a reference set of non-interacting protein pairs
The Negatome is a collection of protein and domain pairs that are unlikely to be engaged in direct physical interactions. The database currently contains experimentally supported non-interacting protein pairs derived from two distinct sources: by manual curation of literature and by analyzing protein complexes with known 3D structure. More stringent lists of non-interacting pairs were derived f...
Variational Calculations for the Relativistic Interacting Fermion System at Finite Temperature: Application to Liquid 3He
In this paper, at first we have formulated the lowest order constrained variational method for the relativistic case of an interacting fermion system at finite temperature. Then we have used this formalism to calculate some thermodynamic properties of liquid in the relativistic regime. The results show that the difference between total energies of relativistic and non-relativistic cases of liqu...
Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction.
The concept of 'protein function' is rather 'fuzzy' because it is often based on whimsical terms or contradictory nomenclature. This currently presents a challenge for functional genomics because precise definitions are essential for most computational approaches. Addressing this challenge, the notion of networks between biological entities (including molecular and genetic interaction networks ...